readme.txt

REPLICATION DATA AND CODE

William N. Evans, Craig Garthwaite and Timothy J. Moore.  

"The White/Black Educational Gap, Stalled Progress, and the Long-term Consequences of the Emergence of Crack Cocaine Markets"

Review of Economics and Statistics

Production Date: July 1, 2016

Programs used: SAS 9.2 and Stata 13

========================================================================
ANALYSIS OF MORTALITY DATA (/mortality)

The mortality data for this paper are the Multiple Cause of Death (MCOD) data files compiled by the National Center for Health Statistics (NCHS). The population data are from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program.

Please use the code mortality_analysis.sas. It is used to create the following outputs:
• Figure 2
• Figure 4, Panels A to C
• Table 1
• Table 2
• Table A1
• Table A2
• Murder rates for ages 20-24 by race and sex (MSA and state); these are inputs into the analysis of education data (the murder rates are already merged into the relevant Stata files)
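
For readers unfamiliar with how the mortality counts and the SEER population data combine, the following is a minimal sketch of a murder-rate calculation. It is written in Python purely for illustration; the actual computation is done in mortality_analysis.sas. The per-100,000 scaling, the field names, and all numbers are assumptions, not values taken from the replication files.

```python
# Illustrative only: the paper's murder rates come from mortality_analysis.sas.
# The per-100,000 scaling, field names, and numbers below are hypothetical.

def rate_per_100k(deaths, population):
    """Return deaths per 100,000 population, or None if population is not positive."""
    if population <= 0:
        return None
    return deaths * 100_000 / population

# Hypothetical cells for one area, ages 20-24, by race and sex.
cells = [
    {"area": "A", "race": "black", "sex": "male", "deaths": 12, "population": 40_000},
    {"area": "A", "race": "white", "sex": "male", "deaths": 3, "population": 60_000},
]

# One rate per (area, race, sex) cell.
rates = {
    (c["area"], c["race"], c["sex"]): rate_per_100k(c["deaths"], c["population"])
    for c in cells
}

for key, rate in sorted(rates.items()):
    print(key, rate)
```

Returning None for an empty population cell keeps such cells visibly missing rather than silently treating them as zero.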

Please use the code mortality_analysis2.do to create the following outputs:
• Figure 5

This folder contains the following datasets, which are used in the analysis:
• A file that converts the NCHS county codes to FIPS county codes ("fips_nchs_conversion")
• A file that converts the NCHS county codes to MSA codes ("msa_county_conversion")
• NCHS county codes merged to MSA identifiers for the mortality data in 1989, 1990, 1994 and 1997 ("nchs_geog89", "nchs_geog90", "nchs_geog94", "nchs_geog97")
• NCHS county codes for 1994 merged to MSA identifiers with the fraction of deaths covered by these counties in 1994 ("codes1994")
• A file that attaches MSA names to MSA identifiers ("msa_names")
• Single-year-of-age county-level population data for 1969-2014 (zipped "us.1969_2014.singleages.adjusted")
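
The conversion files above act as crosswalks: each death record carries an NCHS county code, and the crosswalks supply the matching FIPS county and MSA codes. A minimal sketch of that lookup is given below in Python for illustration only; the real merge is performed in the SAS code. The column names (nchs_county, fips, msa) and the code values are hypothetical, and the actual file layouts may differ.

```python
# Illustrative only: column names and code values are hypothetical placeholders,
# not the actual layout of fips_nchs_conversion or msa_county_conversion.

def build_lookup(rows, key, value):
    """Map each row's `key` field to its `value` field, rejecting conflicting duplicates."""
    lookup = {}
    for row in rows:
        k = row[key]
        if k in lookup and lookup[k] != row[value]:
            raise ValueError(f"conflicting mapping for {k!r}")
        lookup[k] = row[value]
    return lookup

def attach_codes(records, nchs_to_fips, nchs_to_msa):
    """Return copies of the records with FIPS and MSA codes added (None when unmatched)."""
    merged = []
    for rec in records:
        out = dict(rec)
        out["fips"] = nchs_to_fips.get(rec["nchs_county"])
        out["msa"] = nchs_to_msa.get(rec["nchs_county"])
        merged.append(out)
    return merged

# Hypothetical crosswalk rows and death records.
fips_rows = [{"nchs_county": "N001", "fips": "F001"}, {"nchs_county": "N002", "fips": "F002"}]
msa_rows = [{"nchs_county": "N001", "msa": "M10"}]
deaths = [{"nchs_county": "N001", "year": 1990}, {"nchs_county": "N002", "year": 1990}]

matched = attach_codes(
    deaths,
    build_lookup(fips_rows, "nchs_county", "fips"),
    build_lookup(msa_rows, "nchs_county", "msa"),
)
```

In this sketch a county outside any MSA keeps an msa value of None, so unmatched records stay in the data and can be counted or dropped explicitly.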

========================================================================
ANALYSIS OF EDUCATION DATA (/education)

Please use the code education_analysis.do. It is used to create the following outputs:
• Figure 1
• Figure 3
• Table 4
• Table 5
• Table 6
• Table A3
• Table A4
• Table A5
• Table A6

The American Community Survey analysis is done using education_analysis.sas. It is used to create the following outputs:
• Figure 7

There are two other programs in this folder. The program constructing_pums.sas generates the individual samples (2000 state, 2000 MSA, etc.). It produces SAS data sets, which are then converted into the Stata data sets in this folder. The program constructing_acs.sas creates the ACS data in a similar fashion.

This folder contains the following datasets:
• 2000 5% PUMS data for all states ("all_states_all_years_2000_pums.dta")
• 2000 5% PUMS organized by MSA of residence ("current_msa_2000.dta")
• 2000 5% PUMS organized by state of birth ("sob_2000_pums.dta")
• 2000 5% PUMS organized by state of residence ("cs_2000_ungrouped.dta")
• 2000 5% PUMS organized by MSA of residence with murder rates for 15-24 year olds merged in from the mortality analysis ("msa_2000_with_mr_fullset.dta")
• 2000 5% PUMS organized by state of birth with murder rates for 15-24 year olds merged in from the mortality analysis ("sob_2000_with_mr.dta")
• 2000 5% PUMS organized by state of residence with murder rates for 15-24 year olds merged in from the mortality analysis ("cs_2000_with_mr.dta")
• Dataset with identifiers for the core set of 57 MSAs used in much of the analysis ("crosswalk_2.dta")
• Prison intake rates at the state level ("intake_rates_clean.dta")
• Data converted from the SAS mortality analysis that has information on when crack cocaine arrives in MSAs ("when_crack_arrives_msa_5_3_2012.dta")
• Data converted from the SAS mortality analysis that has information on when crack cocaine arrives in states ("when_crack_arrives_5_3_2012.dta")
• Additional CPS state education controls ("state_educ.dta")
• Additional CPS state unemployment controls ("unemp_rate_race_sex_all.dta")
• Additional CPS state demographic controls ("state_demo_characteristics.dta")
• Additional state controls on mothers and children ("mom_kids_state_march_cps_1.dta")
• The raw 2000 5% PUMS extract from which the PUMS data used in the analysis are generated ("usa_00075.dat")
• The raw American Community Survey extract from which the ACS data used in the analysis are generated ("usa_00084.dat")

========================================================================
ANALYSIS OF MSA CHARACTERISTICS (/msa-characteristics)

Please use the code msa-characteristics.do. It is used to create the following outputs:
• Table 3

This folder contains the following datasets:
• Changes in socioeconomic covariates between 1970 and 1980 from the 5% PUMS ("pums70_80.dta")
• Covariates constructed from Bureau of Economic Analysis ES-202 data ("es_202_final.dta")
• Minimum distance to New York, Miami, and Los Angeles ("final_distance_data.dta")
• Information on when crack cocaine arrives in MSAs taken from the mortality analysis ("crack_arrives_duration_final.dta")

========================================================================
ANALYSIS OF MSA CHARACTERISTICS (/socioeconomic)

Please use the code msa-characteristics.do. The programs and data are used to create the following outputs:
• Table 7

We provide descriptive information about changes in the family characteristics and economic conditions of black and white families with children aged 0 to 18 from 1980 to 1990. Please use the following code:
• To generate the first sets of means in the table, use read_kids_march_cps_1.sas
• To generate the rate of return to college, use read_march_cps_college_prem.do and return_college_males.do
• To generate the real K-12 spending from Evans and Corcoran (2010), use corcoran_evans_means.do

This folder contains the following datasets:
• cps_00022.gzp is the data used to generate the rate of return to college
• balanced_panel.dta is the data for Corcoran and Evans

